Document image analysis with cooperative interaction between layout analysis and logical structure analysis

Author

  • Yasuto Ishitani
Abstract

When a printed document is to be input to a computer system, the document must be converted to a computer-readable format, e.g., ASCII, PDF, RTF, CSV, or SGML/XML/HTML-tagged data. To obtain these data formats from a printed document, it is necessary to extract as much information from it as possible, i.e., layout structure (layout objects and their hierarchical relationship), logical structure (logical objects and their reading order), and contents (text as OCR results). Logical structure analysis is concerned with attaching logical meanings to the layout structure and determining the reading order. The logical structure may be extracted hierarchically from the layout structure, which the layout analysis process extracts from a document image. In this hierarchical analysis, some ambiguities in the result of a single functional process cannot be resolved because that process has insufficient information. For example, sparse, irregular, or unconfined layout styles such as unpredictable non-text elements, program lists, mathematical expressions, or tables without ruled lines might not be analyzed correctly using only geometric information in the layout analysis process. If processes are connected sequentially in an entire system, system performance deteriorates because the errors of all the processes accumulate. In this paper, the author proposes a new document image analysis method with cooperative interaction between layout analysis and logical structure analysis to resolve the above problems. The proposed method has the following three advantages compared with other methods.

(1) It can extract logical structures from layout structures that are imperfect due to large variation in formats, noise, or erroneous feature detection. Erroneous layout structure is automatically detected and is resolved by cooperation between layout analysis and logical structure analysis.

(2) It breaks through previous limitations in layout analysis capabilities, because high-level logical information can be used in layout analysis. After this layout analysis, the global column structure of a document is extracted from the logical structure.

(3) It can extract the logical structure of a document accurately, because the information to be used for logical structure analysis is obtained from the orderly column structure and is made more accurate by the cooperative interaction between layout analysis and logical structure analysis.
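The cooperative idea in the abstract can be illustrated with a toy feedback loop. The sketch below is not the author's algorithm; it only assumes that layout analysis groups text lines by vertical gaps, that logical analysis labels a block by line height, and that a block mixing heading-sized and body-sized lines is sent back to layout analysis with a tighter gap. All names (`Line`, `layout_analysis`, `logical_analysis`, `analyze`) and thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Line:
    """A text line reduced to its vertical extent in pixels (toy feature set)."""
    top: int
    bottom: int

    @property
    def height(self) -> int:
        return self.bottom - self.top


Block = List[Line]


def layout_analysis(lines: List[Line], max_gap: int) -> List[Block]:
    """Geometry only: a vertical gap larger than max_gap starts a new block."""
    blocks: List[Block] = []
    for line in sorted(lines, key=lambda l: l.top):
        if blocks and line.top - blocks[-1][-1].bottom <= max_gap:
            blocks[-1].append(line)
        else:
            blocks.append([line])
    return blocks


def logical_analysis(block: Block, heading_height: int = 20) -> str:
    """Label a block; mixed heading/body line heights are reported as ambiguous."""
    kinds = {"heading" if l.height >= heading_height else "body" for l in block}
    return kinds.pop() if len(kinds) == 1 else "ambiguous"


def analyze(lines: List[Line], max_gap: int = 12, min_gap: int = 2) -> List[Tuple[str, Block]]:
    """Feedback loop: blocks rejected by logical analysis are re-segmented
    by layout analysis with half the gap, until min_gap is reached."""
    result: List[Tuple[str, Block]] = []
    pending = [(b, max_gap) for b in layout_analysis(lines, max_gap)]
    while pending:
        block, gap = pending.pop(0)
        label = logical_analysis(block)
        if label != "ambiguous" or gap <= min_gap:
            result.append((label, block))
        else:
            new_gap = gap // 2
            # Put the sub-blocks at the front so reading order is preserved.
            pending = [(b, new_gap) for b in layout_analysis(block, new_gap)] + pending
    return result


# A heading placed close to two body lines, then a distant body line.
lines = [Line(0, 24), Line(32, 44), Line(48, 60), Line(100, 112)]
result = analyze(lines)
print([label for label, _ in result])  # ['heading', 'body', 'body']
```

With geometry alone (`max_gap=12`) the heading is merged into the first body block; the logical check flags that block, and the second layout pass with a gap of 6 separates it. A purely sequential pipeline would have passed the merged block downstream unchanged.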

Similar resources

Discovering Knowledge through Multi-modal Association Rule Mining for Document Image Analysis

The paper introduces a descriptive data mining method to discover knowledge for the task of automatic categorization in document image analysis. We argue that a document image is a multi-modal unit of analysis whose semantics is deduced from a combination of textual content, layout structure and logical structure. So, the method considers simultaneously different modalities of document represen...

UW-ISL Document Image Analysis Toolbox: An Experimental Environment

A document image analysis toolbox, including a collection of data structures and algorithms to support a variety of applications, is described in this paper. An experimental environment is built to allow developers to develop, test and optimize their algorithms and systems. Appropriate and quantitative performance metrics for each kind of information a document analysis technique infers have be...

Layout Based Information Retrieval from Document Images

This research is intended to develop a layout-based retrieval system for document image databases, consisting of three phases: 1. First, an intelligent layout analysis algorithm has been designed to extract the physical layouts of the document images with their edges and rectangles. 2. Every physically identified layout has been converted into an intermediate tree representation for indexing and s...

Persian Printed Document Analysis and Page Segmentation

This paper presents a hybrid method, combining low-resolution and high-resolution analysis, for Persian page segmentation. In the low-resolution page segmentation, a pyramidal image structure is constructed for multiscale analysis and the document image is segmented into a set of regions. In the high-resolution page segmentation, each region is segmented by connected components analysis into homogeneous regions and identifyi...

An integrated approach to document decomposition and structural analysis

A document image is a visual representation of a paper document, such as a journal article page, a cover page of facsimile transmission, office correspondence, an application form, etc. Document image understanding as a research endeavor consists of developing processes for taking a document through various representations: from scanned image to semantic representation. This paper describes docum...

Publication date: 1999